Lattice segmentation and minimum Bayes risk discriminative training for large vocabulary continuous speech recognition
Authors
Abstract
Lattice segmentation techniques developed for Minimum Bayes Risk decoding in large vocabulary speech recognition tasks are used to compute the statistics needed for discriminative training algorithms that estimate HMM parameters so as to reduce the overall risk over the training data. New estimation procedures are developed and evaluated for both small and large vocabulary recognition tasks, and additive performance improvements are shown relative to maximum mutual information estimation. These relative gains are explained through a detailed analysis of individual word recognition errors.
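The abstract centers on Minimum Bayes Risk decoding, which selects the hypothesis with the lowest expected loss under the posterior distribution rather than the single most probable hypothesis. A minimal sketch of that decision rule over an explicit hypothesis list (the words and posterior values here are hypothetical; the paper operates on segmented lattices, not small n-best lists):

```python
def levenshtein(a, b):
    # Word-level edit distance between two word sequences,
    # computed with the standard dynamic-programming recurrence.
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def mbr_decode(hypotheses):
    # hypotheses: list of (word_sequence, posterior_probability) pairs.
    # Return the hypothesis minimizing expected Levenshtein loss,
    # i.e. argmin_h sum_r P(r) * L(h, r).
    best, best_risk = None, float("inf")
    for hyp, _ in hypotheses:
        risk = sum(p * levenshtein(hyp, ref) for ref, p in hypotheses)
        if risk < best_risk:
            best, best_risk = hyp, risk
    return best, best_risk

hyps = [(("a",), 0.4), (("b", "c"), 0.3), (("b", "d"), 0.3)]
word, risk = mbr_decode(hyps)  # -> ("b", "c"), expected loss 1.1
```

Note that the maximum-posterior hypothesis ("a", posterior 0.4) loses here to "b c", whose expected word edit distance is lower; this gap between likelihood-based and loss-based decisions is what risk-based training and decoding exploit.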
Similar Resources
Boosting Minimum Bayes Risk Discriminative Training
A new variant of AdaBoost is applied to a Minimum Bayes Risk discriminative training procedure that directly aims at reducing Word Error Rate for Automatic Speech Recognition. Both techniques try to improve the discriminative power of a classifier, and we show that they can be combined to yield even better performance on a small vocabulary continuous speech recognition task. Our results also...
Minimum Bayes Risk Estimation and Decoding in Large Vocabulary Continuous Speech Recognition
Minimum risk estimation and decoding strategies based on lattice segmentation techniques can be used to refine large vocabulary continuous speech recognition systems through the estimation of the parameters of the underlying hidden Markov models and through the identification of smaller recognition tasks which provide the opportunity to incorporate novel modeling and decoding procedures in LVCSR...
Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognition
Iterative estimation procedures that minimize empirical risk based on general loss functions such as the Levenshtein distance have been derived as extensions of the Extended Baum Welch algorithm. While reducing expected loss on training data is a desirable training criterion, these algorithms can be difficult to apply. They are unlike MMI estimation in that they require an explicit listing of t...
GiniSupport vector machines for segmental minimum Bayes risk decoding of continuous speech
We describe the use of Support Vector Machines (SVMs) for continuous speech recognition by incorporating them in Segmental Minimum Bayes Risk decoding. Lattice cutting is used to convert the Automatic Speech Recognition search space into sequences of smaller recognition problems. SVMs are then trained as discriminative models over each of these problems and used in a rescoring framework. We pos...
Lattice segmentation and minimum Bayes risk discriminative training
Modeling approaches are presented that incorporate discriminative training procedures in segmental Minimum Bayes-Risk decoding (SMBR). SMBR is used to segment lattices produced by a general automatic speech recognition (ASR) system into sequences of separate decision problems involving small sets of confusable words. We discuss two approaches to incorporating these segmented lattices in discrim...
Journal:
Speech Communication
Volume 48, Issue
Pages: -
Publication date: 2006